WHERE TEST SCORES AND TEACHERS' MARKS
DISAGREE 



MARY D. LINDSAY and RUTH S. GAMSBY 1 
Pittsburgh, Pennsylvania 



A high-school teacher who has seen educational theories come 
and go has a proverbially conservative attitude toward "the 
latest fad." Such is the attitude of many minds toward mental 
testing, and the question its sponsors must answer before these 
persons will look upon it with enthusiasm is, "What is the use of 
it?" The following analysis of the results of a group test in one 
school may possibly help to answer that question. 

In November, 1920, the Terman Group Test was given to all 
the students of the Palo Alto Union High School. At the same 
time an estimate of the work of each student in each subject was 
given by the teacher in charge. These estimates were to range 
from 1 (high) to 7 (low); but the results showed no estimate below
6. The tests were scored and checked, and the estimates averaged. 
Then scatter diagrams were made, showing the test scores on the 
vertical scale and on the horizontal scale the teachers' estimate 
of work. These diagrams show that while there is a tendency in 
the majority of cases for the low score to correspond to the low 
estimate of work and the high score to the high estimate, this 
correspondence does not always obtain. In a number of instances 
a high score for intelligence is accompanied by poor work, and in a 
few cases the reverse is true. The purpose of the study here 
reported was to seek an explanation for the cases showing a wide
difference between the test score and the average of teachers' 
estimates. 
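
The kind of divergence described above can be illustrated with a short sketch. The authors picked their cases from the diagrams by observation, so the least-squares line, the cutoff of 1.5 residual standard deviations, the function name `flag_divergent`, and the seven sample pupils below are all assumptions made for illustration; none of them comes from the study.

```python
from statistics import mean

def flag_divergent(cases, threshold=1.5):
    """Return ids of cases whose estimate lies far from the least-squares line.

    cases: list of (case_id, test_score, teacher_estimate) tuples.
    threshold: cutoff in units of residual standard deviation (an assumption).
    """
    xs = [c[1] for c in cases]
    ys = [c[2] for c in cases]
    mx, my = mean(xs), mean(ys)
    # Ordinary least-squares fit of estimate on test score.
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    # Vertical distance of each pupil from the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return [c[0] for c, r in zip(cases, residuals) if abs(r) > threshold * spread]

# Hypothetical pupils: estimates run from 1 (high) to 7 (low), so higher
# test scores should go with lower numbers. Pupil G scores high but is rated low.
pupils = [("A", 60, 5.0), ("B", 80, 4.5), ("C", 100, 4.0), ("D", 120, 3.5),
          ("E", 140, 3.0), ("F", 160, 2.5), ("G", 150, 5.5)]
print(flag_divergent(pupils))  # ['G']
```

A positive residual marks a pupil rated worse than the score suggests, and a negative one marks the reverse, which are the two kinds of disagreement taken up in the rest of the study.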

From the four scatter diagrams representing about five hundred 
students, forty-six cases were chosen from observation as being out 
of the direct line of relationship. These cases are indicated by circles 

1 At the time the present study was made, the authors were graduate students in 
the Department of Education, Leland Stanford Junior University. 

THE SCHOOL REVIEW [November

with numbers or the letter L below them. Then the semester 
grades, given in February, were secured and averaged for each 
child. These semester grades, which ran from 1 to 5, were reduced
to the scale of the former estimate, and marked on the diagram 
in such a way as to show any change in the teachers' estimate. 
The number under each dot has a corresponding number on the 
same line showing the average obtained in the actual semester 
marks. 1 The letter L under a dot means that the pupil left without 
completing the work. There was a general tendency in the case 
of those whose intelligence score had been high for the teachers' 
estimate to have more nearly approximated it by the end of the 
term. Strange to say, however, where the test score was low 
and the estimate high, the final grade tended to diverge even 
more. Another interesting fact was that of the forty-six cases 
chosen, nine, or nearly 20 per cent, had left the school without 
graduating, as opposed to 7 per cent of the entire school popula- 
tion that left without graduating. This fact alone seems to point 
to maladjustment on the part of these pupils. Two more have 
left since this study was made. 
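
The comparison of withdrawal rates can be checked with simple arithmetic. The three figures below are taken from the text; the variable names are only illustrative.

```python
selected_cases = 46   # cases chosen from the scatter diagrams (from the text)
selected_left = 9     # of those, pupils who left without graduating (from the text)
school_rate = 0.07    # share of the entire school that left (from the text)

selected_rate = selected_left / selected_cases
ratio = selected_rate / school_rate
print(f"{selected_rate:.1%} of the selected cases left, "
      f"about {ratio:.1f} times the school-wide rate")
```

Nine of forty-six is about 19.6 per cent, which agrees with the "nearly 20 per cent" reported and is somewhat less than three times the rate for the school as a whole.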

In the cases where the semester mark tended to bring the student 
into his proper place according to the Terman Group Test, the only 
problem was to find the explanation for the original error in 
estimate. The explanations were found to be typical of most 
misunderstandings that arise from lack of acquaintance between 
pupil and teacher. In Grade IX, Case 1 looked immature and 
seemed to act so; Case 2 was deaf; Cases 4, 5, and 7 were explained 
by the maladjustment caused by entering high school direct from 
a very different sort of school; and Case 6 was so shy that she 
gave the impression of sullenness. In Grade X, Case 1, a girl, 
was suffering from the effects of adenoids; Case 2 was young and 
shy; Cases 3, 4, and 5 were large, slow pupils. In Grade XI, 
Case 2, a new boy, stammered; and Case 3 has never completed 
the work of his grade on account of illness. In Grade XII, Case 1 
had been doing a great deal of outside work during the first half 
of the semester, and had chosen for a companion a boy who was

1 No. 5 in Fig. 1, No. 8 in Fig. 2, and No. 3 in Fig. 3 have no corresponding
numbers because the semester work was not completed at the time of the investigation.

known to care nothing for his school work. Cases 2 and 3 were
new boys, the first of whom had come in with a record for work 
broken on account of ill health; the second, with a speech defect 
and a very poor muscular control; Case 4 was in love and was, 
besides, a very impractical sort of boy; Case 5 was a loafer with 
outside interests. These cases needed no further investigation as

Fig. 1. Distribution of Grade IX students on the basis of Terman Group Test
scores and teachers' estimates.



the semester mark tended to correct the former estimate, and the 
explanation of the original discrepancy seemed satisfactory. 

Where the mark failed to approximate the test score, further 
investigation was made. The estimate of teachers after three 
months of work in the second semester was obtained on work, 
intelligence, and class attitude, and an effort was made to find 
out as much as possible from the school concerning the personality 
and standing of the pupil. In many instances a Binet individual
test was given, partly to check up the group test score, partly to
give a chance for a personal interview with the student in question. 

It was noticeable, in dealing with the cases whose test scores 
ran considerably higher than the school mark, that the attitude 
of the pupil as interpreted by his teachers was never entirely satis- 
factory and in some cases was very bad. It was also noticeable 
that these cases were recognized as school problems. "You are 
asking about students who have bothered us a great deal," we 
were told more than once. 

In Grade IX, Case 3 represents a boy seventeen years and 
six months old, test score 154, grade average 3.25. He was not 
interested in his work, smoked a great deal, and finally left this 
school to go back to a school he had previously attended. 

In Grade X, Case 6 represents a boy fourteen years and nine 
months old at the time the test was given. He scored 157. His 
semester mark was 4 and his present average of work, 4.25. This 
boy's I.Q., as shown by the Binet test, was slightly over 110.
His class attitude was bad, and he was disliked by his teachers. 
It was not difficult to see why. The boy's appearance, unattrac- 
tive because of a very bad complexion, was made almost repulsive 
by his slovenly and dirty clothes. He was quite uninterested in 
his work and found his chief joy in driving his father's machine 
at breakneck speed. He gave one the impression of an overgrown 
boy who would find himself in time. He had a sister in Grade 
IX (Case 8) whose score was 134, but whose semester mark was 2. 
Case 7 in Grade X represents a boy fifteen years and seven months 
of age, testing at 157 and making a final mark of 4.75. He was 
obviously lazy and has since left school to take up part-time work. 
Case 8 represents another lazy boy who has never completed the 
work to make his grade. He himself declared that his greatest 
enjoyment was dancing. Case 9, age fifteen years and one month, 
score 142, and semester mark 4, is a girl with many outside interests. 
Her mother is sick and she has to tend to the household. She 
is also ambitious socially. Case 11 is a girl whose present work,
according to her teachers' estimates, tends to approach her ability 
as shown by the score. She was only fourteen years and two 
months of age when she took the Terman test, but was a big overgrown
girl, looking much older than her real age. Her parents
are both college graduates, interested in her work. A Binet test 
shows her I.Q. to be 117, and there is no reason why she should not 
do satisfactory work now that she is becoming somewhat better 
poised. 

In Grade XI, a boy of eighteen years made a score of 181 and, 
when given the Binet series, passed all the tests. Yet his work is

Fig. 2. Distribution of Grade X students on the basis of Terman Group Test
scores and teachers' estimates.



only average. He is a well-mannered quiet boy but peculiar looking 
because his face is asymmetrical. For some reason he had made 
a rather unfavorable impression on his teachers who referred to 
him as queer without being able to explain exactly why they de- 
scribed him in this way. From his own account, he had a great 
deal of outside work to do as he managed a two-acre apricot orchard 
alone. He also said that he had only recently begun to realize
that it was worth while to study. He is particularly interested
in electricity. 

In the cases of low score and high grade, we gave a Binet test 
whenever there was any doubt of the group test score. But in 
practically every case the Binet score tended to verify the group 
test score; hence it was necessary to look for an explanation of 
high marks in factors other than intelligence. In Grade IX,
Case 9 represents a girl fourteen years and six months old when 
she took the test, who made a score of 45. Yet the estimate of 
her work was 3.8 and her semester mark, 2.9. Here the group 
test score seemed to be in error, for the girl confessed that at the 
time she took the test she was very much frightened, as she had 
been told that it would affect her grade mark. During the Binet 
test she was entirely at ease and declared she enjoyed it. Yet, 
according to it, her I.Q. is 88, a proof that she is distinctly dull. 
She is an earnest white-faced wisp of a girl who works exceedingly 
hard and has won the admiration of her teachers by the persever- 
ance with which she not only pushed through her own work, but 
also enabled a nearly feeble-minded brother to carry the work of 
his grade. (Case marked L in Grade IX; score, 30; average, 
4.25). This brother has now gone into an agricultural school. 
Case 10 in Grade IX is somewhat similar to Case 9. The boy 
in question made a test score of 42 and a final grade of 2. He is 
good-looking and pleasant-mannered. His afternoons are spent 
working in a store and his evenings in study. By the industrious 
use of every minute, he manages to spend from one to two hours 
on each lesson. He disliked the group test exceedingly and felt 
that it was a ridiculous measure of intelligence. On the other 
hand, he felt that the Binet test measured him fairly. Yet his 
I.Q. was 89. Here again unusual industry has in a measure over- 
come lack of ability. Then, too, this boy is nearly seventeen, and 
his greater maturity counts in his favor in the grade of work he 
is doing. 

In Grade X, Case 12 proved rather interesting from the fact 
that the subject, a girl of average ability according to the score, 
was first marked too low and then too high. She started in as a 
new girl in the fall and worked hard. Her father is an important
member of the community with a great deal of political influence.
The girl's attitude is excellent, and there is every reason why her 
teachers should wish to think well of her ability. Grade X, Case 
13 is a Chinese boy who works very hard and whose attractive 
appearance helps to make his teachers tolerant of his language 
difficulty. His test score, 112, showed fair ability and probably

Fig. 3. Distribution of Grade XI students on the basis of Terman Group Test
scores and teachers' estimates.

did not show his full powers. The Binet test which gave him 
more time and would perhaps be a fairer test showed his I.Q. to be 
98, and as he is over twenty years of age, his mentality is quite 
sufficient explanation of his grade of 2. Case 14, with a score of 
83, made a semester average of 3, and her present average for the 
second semester is 2. Here, too, the Binet test corresponded 
rather to the Terman Group Test than to the teachers' ratings, 
for she is fifteen years of age with an I.Q. of 95. But here again
a very attractive appearance plus a conscientious, earnest disposition
and an excellent attitude toward work produce the impression
of solid worth. High marks naturally follow.

In Grade XI, Case 5, with a score of 63, has an estimate of work 
of 3.7, only a step below average. A more detailed examination of 
her marks reveals excellent work in household arts, which tends 
to raise the average of rather poor academic marks. 

Grade XII, Case 8 shows the same situation, a test score of 
77 and a semester mark of 2.25. But the subjects taken were 
art and typewriting, and the girl was considered dull by the school. 
Case 7 is a foreigner, a Filipino. One other case in Grade XII, 
Case 6, showed a good score, 163, and a mark of 1.25, the highest
average in the school. The girl in question is a faithful student, 
seriously crippled, a girl whom everyone likes, admires, and 
pities. 

In summarizing the details of this investigation, it was found 
that the same explanations again and again accounted for diver- 
gences. In the instances where the semester marks tended to cor- 
rect the term estimates in relation to the test score, the trouble 
lay in lack of adjustment on the part of the pupil to new conditions 
or to a physical defect of one sort or another. In the instances 
where the semester marks did not correct the estimate of work, 
if the test score was high and the estimate low, there was an impres- 
sion of either a character defect or outside interest. When the 
test score was considerably lower than the mark, the explanation 
lay in the unusual force of character, the charm of the students, 
in the fact that the pupils were foreigners and so had difficulty 
in the test from ignorance of the language, or in the selection of 
vocational, handicraft subjects. 

Of the cases chosen, twenty-six showed more ability in the test 
than their school work indicated, while the school marks of ten 
were higher than the test score in any case would predict. In 
every case where the Binet test was given, the findings tended to 
approximate those of the group test. 

What, then, is the value of this study? With large classes 
and short terms, the best of teachers make many mistakes, some 
of which are corrected by the end of the term. Here in certain
group tests is a criterion, the accuracy of which has been proved
to be fairly reliable. While the tests may possibly underestimate 
the ability of certain timid pupils, they do not overestimate ability. 
The reason discovered in this high school for poor work in spite 
of good ability could in each case have been overcome if the teacher

Fig. 4. Distribution of Grade XII students on the basis of Terman Group
Test scores and teachers' estimates.

had been sure that the student was capable of doing more work. 
On the other hand, the realization that a pupil does good work in 
spite of mediocre intelligence gives one a chance to praise and 
encourage a character of unusual worth. Surely if a group test 
of mental ability which can be given in forty minutes will give to 
high-school teachers a short cut to an accurate knowledge of what 
they have a right to expect from their students, mental tests have 
proved their value in the school and teachers will continue to use 
them. 